National Repository of Grey Literature
Towards Machine Translation Based on Monolingual Texts
Kvapilíková, Ivana ; Bojar, Ondřej (advisor) ; Espana-Bonet, Cristina (referee) ; Čmejrek, Martin (referee)
Title: Towards Machine Translation Based on Monolingual Texts
Author: Ivana Kvapilíková
Institute: Institute of Formal and Applied Linguistics
Supervisor: doc. RNDr. Ondřej Bojar, Ph.D., Institute of Formal and Applied Linguistics
Abstract: The current state of the art in machine translation (MT) heavily relies on parallel data, i.e. texts that have been previously translated by humans. This type of resource is expensive and only available for several language pairs in limited domains. A new line of research has emerged to design models capable of learning to translate from monolingual texts which are significantly easier to obtain, e.g. by web-crawling. While it is impressive that such models achieve translation capabilities, the translation quality of the output they produce is still low for practical applications. This dissertation thesis strives to improve their performance. We explore the existing approaches of using monolingual resources to train translation models and propose a new technique to generate pseudo-parallel training data artificially without expensive human input. We automatically select similar sentences from monolingual corpora in different languages and we show that using them in the initial stages of MT training leads to a significant enhancement in translation quality. We also...
